Inducing Probabilistic Syllable Classes Using Multivariate Clustering
نویسندگان
چکیده
An approach to automatic detection of syllable structure is presented. We demonstrate a novel application of EM-based clustering to multivariate data, exempli ed by the induction of 3and 5-dimensional probabilistic syllable classes. The qualitative evaluation shows that the method yields phonologically meaningful syllable classes. We then propose a novel approach to grapheme-to-phoneme conversion and show that syllable structure represents valuable information for pronunciation systems.
منابع مشابه
Inducing Probabilistic Syllable Classes Using Multivariate Clustering -gold
An approach to automatic detection of syllable structure is presented. We demonstrate a novel application of EM-based clustering to multivariate data, exempliied by the induction of 3-and 5-dimensional probabilistic syllable classes. The 3-dimensional models were subjected to a pseudo-disambiguation task, the result of which shows that the onset is the most variable, or least predictable, part ...
متن کاملProbabilistic Landslide Risk Analysis and Mapping (Case Study: Chehel-Chai Watershed, Golestan Province, Iran)
The efficiency of three statistical models, AHP surface-weighted density bivariate (semi-quantitative models), stepwise multivariate regression and logistic multivariate regression models were compared in Chehel-Chai watershed in Golestan province, Iran. In current study the hazard map was prepared according to the top model of landslide hazard map. Chehel-Chai watershed is located as one of Go...
متن کاملInducing German Semantic Verb Classes from Purely Syntactic Subcategorisation Information
The paper describes the application of kMeans, a standard clustering technique, to the task of inducing semantic classes for German verbs. Using probability distributions over verb subcategorisation frames, we obtained an intuitively plausible clustering of 57 verbs into 14 classes. The automatic clustering was evaluated against independently motivated, handconstructed semantic verb classes. A ...
متن کاملEfficient induction of probabilistic word classes with LDA
Word classes automatically induced from distributional evidence have proved useful many NLP tasks including Named Entity Recognition, parsing and sentence retrieval. The Brown hard clustering algorithm is commonly used in this scenario. Here we propose to use Latent Dirichlet Allocation in order to induce soft, probabilistic word classes. We compare our approach against Brown in terms of effici...
متن کاملApplication of Probabilistic Clustering Algorithms to Determine Mineralization Areas in Regional-Scale Exploration Studies
In this work, we aim to identify the mineralization areas for the next exploration phases. Thus, the probabilistic clustering algorithms due to the use of appropriate measures, the possibility of working with datasets with missing values, and the lack of trapping in local optimal are used to determine the multi-element geochemical anomalies. Four probabilistic clustering algorithms, namely PHC,...
متن کاملذخیره در منابع من
با ذخیره ی این منبع در منابع من، دسترسی به آن را برای استفاده های بعدی آسان تر کنید
عنوان ژورنال:
دوره شماره
صفحات -
تاریخ انتشار 2000